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G ' Abstract 

o 

Analysing stationary point databases to extract phenomenological rate con- 

^ ' stants can become time-consuming for systems with large potential energy bar- 

riers. In the present contribution we analyse several different approaches to this 

^D ' problem. First, we show how the original rate constant prescription within the 

-- , ' discrete path sampling approach can be rewritten in terms of committor proba- 

-4— > ' 

C^ , 

Ch \ bilities. Two alternative formulations are then derived in which the steady-state 

'^ ' assumption for intervening minima is removed, providing both a more accurate 

Q \ kinetic analysis, and a measure of whether a two-state description is appropri- 



ate. The first approach involves running additional short kinetic Monte Carlo 
(KMC) trajectories, which are used to calculate waiting times. Here we in- 
troduce 'leapfrog' moves to second-neighbour minima, which prevent the KMC 
trajectory oscillating between structures separated by low barriers. In the second 
approach we successively remove minima from the intervening set, renormalising 
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the branching probabihties and waiting times to preserve the mean first-passage 
times of interest. Regrouping the local minima appropriately is also shown to 
speed up the kinetic analysis dramatically at low temperatures. Applications are 
described where rates are extracted for databases containing tens of thousands 
of stationary points, with effective barriers that are several hundred times ksT. 

1 Introduction 

A great deal of effort is currently focused on the development of new methods to treat 
'rare events'. A number of these methods employ a coarse-graining approximation 
of some kind to the phase space. ^"^ Examples include the interface formulation^ of 
transition path sampling,^' ^ Markovian state models,^ milestoning,^ master equation 
approaches,^' ^ and discretized reaction paths. ^ The present contribution focuses on the 
discrete path sampling (DPS) approach, ^"^^ which produces a database of stationary 
points from the underlying potential energy surface (PES). Several new methods are 
developed for extracting phenomenological two-state rate constants from this database. 
As well as speeding up the kinetic analysis, we can also determine the extent to which 
a two-state description is appropriate. 

The particular coarse-graining approach considered here focuses on a formally exact 
partitioning of the PES into the basins of attraction'^ of all the local minima. ^'^ The cor- 
responding superposition approach to thermodynamics is based upon the same division 
of the partition function. ^°'^^ Kunz and Berry^^ extended this scheme to treat kinet- 
ics by including transition states, which are here defined geometrically, as stationary 
points that possess a single negative Hessian eigenvalue. ^^ They employed statistical 
rate theory to calculate individual minimum-to-minimum rate constants, and a master 
equation approach^^' ^^ to extract global dynamics. Unfortunately, the number of sta- 
tionary points is generally expected to scale exponentially with system size,^^'^''' and we 
must therefore derive an appropriate sampling scheme in order to represent the kinetics 



properly. The DPS approach was introduced specifically to deal with this problem.^' ^° 
To start a DPS analysis we must have an order parameter that enables us to identify 
local minima, aGA and bGB, which belong to the two end-point states of interest, A 
and B. Minima that do not belong to either set will be labelled iGl, where T stands 
for 'intervening'. Local equilibrium must be established within each of the A and 
B regions relative to their interconversion rate for two-state kinetics to apply. The 
original DPS formulation then showed how phenomenological two-state rate constants 
k^B and k^A could be written as a sum over discrete paths between A and B, so long 
as minima in the I set could be treated according to the steady-state approximation.^ 
Here a discrete path is defined as a sequence of local minima and the transition states 
that connect them. Starting from an initial discrete path that connects A and B, 
the DPS procedure generates new discrete paths by a directed search for additional 
transition states and local minima that can be merged with existing paths. For simple 
cases, where only a few discrete paths make a significant contribution, /cab and A;ba 
are obtained from summing the corresponding terms. ^ However, for more complicated 
pathways there may be many terms that contribute. In particular, the databases 
of stationary points that are constructed during the search for new paths generally 
contain vastly more A^^B discrete paths than are explicitly considered during the 
sampling procedure. However, all these contributions can be summed using a matrix 
multiplication approach based upon results from graph theory.^ In fact these sums are 
equivalent to the committor probabilities discussed in ^ and it is more efficient to 
calculate them using the successive overrelaxation technique. ^^ 

Overall rates can also be extracted from DPS stationary point databases using mas- 
ter equation^^ and kinetic Monte Carlo^^"^^ (KMC) methods.^ These approaches do not 
require the steady-state approximation for minima in the I set, and hence they enable 
us to check whether this approximation is valid. Unfortunately, numerical problems 
often arise in the master equation approach,^ although grouping and pruning the sta- 



tionary point database may help to circumvent this issue. ^'''^^ KMC calculations do not 
experience the same difficulties, but can become very slow if the potential energy bar- 
rier between the A and B regions is large compared to ksT .'^^''^'^ Importance sampling 
KMC schemes have recently been suggested to overcome these problems, ^^' ^^ but we 
have not succeeded in applying them to databases of the complexity considered in the 
present work. Instead we have developed the combined committor probability/KMC 
and graph transformation schemes described in the following sections. 

2 Two- State Rate Constant Expressions 

We start from a linear master equation formulation of the kinetics, ^^'^^ which immedi- 
ately invokes the assumption of Markovian dynamics: 
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where Pa{t) is the occupation probability of minimum a at time t and kap is the rate 
constant for a ^ (3 transitions between minima a and /5, which are directly connected 
by a transition state on the PES. We assume that minima within the A and B sets are 
in local mutual equilibrium, so that 

P.{t) = ?^^ and P.(t) = 3^, (2) 

where Pxif) = XlaeA -^al^), etc., and the superscript 'eq' stands for 'equilibrium'. In the 
original derivation of two-state phenomenological rate constants within the DPS ap- 
proach we also applied the steady-state approximation for each intervening minimum, 
i. These approximations enable us to replace the probabilities -Pi(t) in an iterative 

fashion to obtain 
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and an analogous expression for /i;|^. Here the sum is over all discrete paths that 
begin at any minimum bGB and end at any minimum aGA, and the prime denotes 
a restriction that only intervening minima iGl can be revisited. The 'SS' superscript 
is introduced to distinguish this result, which includes the steady-state approximation 
for minima in the I set. Each factor of the form ky^/ ^^ k^i can be written in terms 
of the corresponding branching probability, Pji = k-^\/ 'Y^ k^i, where the sum in the 
denominator runs over all minima 7 directly connected to i: 
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where Tb = 1/ ^^ k^h is the mean waiting time for escape from minimum b to any of its 

neighbouring minima that are linked to it by a single transition state. This expression 

can be written in a much simpler form by recognising that the sum of the products of 

branching probabilities over all paths from minimum b to minima in the A region that 

revisit only I minima is the committor probability, C^: 

I 

^b = /_^ Pail -'1112 ■ ■ ■ -P in-iin -n^b, (5) 
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For example, C^ is the probability that a stochastic trajectory started from minimum 
b will encounter an A minimum before a B minimum, while C^ = 1 — C^ is the 
probability that a trajectory will encounter the B region before A. The parameter 
P^°^'^, defined as the probability that a protein will fold before unfolding, starting from 
some initial condition a,^'^^'^^ is a more specific example of a committor probability. 

To derive corresponding expressions without invoking the steady-state approxima- 
tion for minima in the I set we write the master equation in terms of transitions between 
members of the A and B sets and the corresponding effective rate constants A'ab and 



^AB = J^ / . / . -^ab-Pfa = 7^ / ,-^Ab-Pb ' 

(7) 
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where the superscript 'NSS' stands for 'non-steady-state'. KAb is the effective rate 
constant for transitions from minimum b to the A minima, and Ksa, is the corresponding 
rate constant for transitions from minimum a to the B minima; the minima in question 
will not generally be connected by a single transition state. Treating transitions from 
minimum b to the A and B regions as independent Poisson processes with the above 
rate constants yields a mean waiting time between transitions of tb = V(-^Ab + -^Bb)- 
Here Ksh corresponds to the effective rate constant for a trajectory to return to any 
member of the B set starting from b. However, KAh/{KA_h + -K^Bb) = K^hth can be 
identified with the committor probability C^, and hence we obtain 

1 /^A pcq 1 ^B peq 
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The only difference between the expressions in equations (jHI) and (jH)) is that the average 
waiting times between events are interpreted differently. In (jHl) tb is the mean waiting 
time between events corresponding to transitions between b and any minima in AUB, 
including return to b itself. However, in Tb is simply the mean waiting time for a 
transition from minimum b to a minimum connected to it by a single transition state. 
Clearly Tb < tb- Furthermore, in the steady-state limit for the intervening minima, the 
corresponding waiting times t\ must be negligible, so that tb — ^ ^b, and /c^|^ -^ fcfg, 
as expected. 

To determine the committor probabilities C^ we use the relation (a first-step anal- 
ysis^'') 

which is analogous to the expression used for P^°''^ in reference [4]. When calculating 
/c|^ and y^x "^6 simply replace the 'A' superscripts by 'B' in equation 0. The sum 



over /3 includes all minima directly connected to minimum a by a single transition 
state, as these are the only non-zero branching probabilities. The C^ were calculated 
iteratively using the successive overrelaxation technique with an extrapolation factor 
of 1.999.^^ The sparse matrix whose elements are the branching probabilities Ppa was 
represented using the compressed row storage scheme. ^^ 

As in previous DPS calculations we can use harmonic densities of states to calculate 
the equilibrium occupation probabilities, along with expressions from statistical rate 
theory^^ for the individual rate constants, A;q,^, corresponding to transitions between 
directly connected minima. Once the committor probabilities have been calculated 
we have all the quantities required to evaluate k^^ and A;|^. To calculate the mean 
waiting time tb for transitions from b G B to any A or B minimum, including revisits 
to I minima, we employ the kinetic Monte Carlo (KMC) approach. ^^"^^ Stochastic 
trajectories are simply run from minimum b until we reach any minimum belonging to 
the A or B sets, and tb is evaluated as an average over a number of independent trials. 

The above formulation in terms of the waiting times tb niay have significant advan- 
tages over the alternative KMC approach, where stochastic trajectories are followed 
from minimum b until they reach an A minimum. This methodology would provide 
the mean first-passage time from b to A, Tb, from which we could calculate the overall 
rate constants as 
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However, if there is a large potential energy barrier between the A and B regions then 
the KMC trajectory will revisit minima in the B region many times before finally reach- 
ing an A minimum. Tb will then be dominated by a vast number of unsuccessful b' ^ b 
transitions.^^ In contrast, the KMC trajectories employed to calculate tb terminate 
when they hit either an A or a B minimum, and are generally very short when there 
is a large barrier. The vast majority of KMC trajectories return to the B region after 
a small number of steps in this situation. 



The rate constant formulations that employ committor probabilities in (jHl) and (jH)) 
will be more efficient that the expressions that use the mean ffist-passage time in (fTUI) 
if we can calculate the C^ or C^ faster than Tb or T.^. Our experience suggests that 
this is indeed the case, and that the corresponding speedup can be very large. Once the 
required committor probabilities are known it is also possible to compare the results 
from equations (jH)) and (jH)) to provide some measure of whether two-state kinetics are 
appropriate. 

3 'Leapfrog' Moves in Kinetic Monte Carlo 

At low temperatures we have found that the KMC runs used to calculate Tb and tb niay 
involve an inordinate number of recrossings for pairs of I minima separated by very 
low barriers.^ This problem was addressed in previous work by evaluating the escape 
probability and associated waiting time analytically for such pairs. ^ A more general 
scheme was employed in the present work by calculating the probability and waiting 
times associated with jumps to second-neighbours for each local minimum ('leapfrog' 
moves) . A further development involves an expansion of the A and B regions to include 
other local minima, which uses a disconnectivity graph analysis,^°'^°'^^ as described in 

SI 

Consider a particular minimum i, with adjacent minima (connected directly by a 
single transition state) labelled by an index j, and second-neighbours labelled by index 
k^^i. For all minima j ^ A U B we must allow for an arbitrary number of i <-> j 
recrossings before escape to a second neighbour, and the recrossings can occur in any 
order. Recrossings to minima j G A U B are not allowed, but escape to such minima 
must be included. Transitions i— ^j ^AUB— i-k^^i and i — ^ j G A U B can 
therefore be broken down into independent events, where the first event involves i <-h> j 
recrossings and the second event involves the actual escape. The following derivation 
is not applicable if neighbour j has a direct connection to another minimum, j', that is 



also adjacent to i, since then we would need to account for arbitrary j <-> j' recrossings 
as well. A more general scheme, which includes such effects, is described in ^ 

We now define new second neighbour branching probabilities, Pj^j, which include all 
possible recrossings between i and j ^ A U B. Consider paths from i to k that include 
rij recrossings i ^-* j for each j ^ A U B, with n = ^. riy The number of distinct paths 
of this kind is n!/ Hi '^j'; ^^^ so the total probability of reaching k from i is 



P' 



J^AUB / n=0 rij \ j0AUB j^AuB 



Vj^AuB / n=0 Vj^AuB 

i^AuB \ J^AUB / 
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where we have used the multinomial theorem,^^ and the prime for the sum over Uj 
indicates a restriction to nj > and ^: Uj = n, with j ^ A U B. Similarly, the total 
probability of reaching a minimum j G A U B from i is 



V J0AUB / 



(12) 



It is easily verified that the new branching probabilities out of i sum to unity, as they 
should. 

The mean waiting time for a transition from i to any second neighbour, k, or 
adjacent minimum, j G A U B, is written as t[. To calculate t' we note that every 
step associated with a branching probability Pa/B adds rg to the duration of a path, on 
average. If we replace each branching probability Pa/3 by P^/? = Pa/s^wiC'T/s) then the 
time associated with any path, multiplied by its probability, can be obtained from 
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(13) 
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In the present work r/ and the corresponding leapfrog transition probabihties, P^j, 
were calculated before the KMC phase of the calculation for each minimum with no 
direct connections between its first neighbours. Leapfrog moves were used for eligible 
minima in the I set if any branching probability Pai exceeded a threshold value. In 
practice, thresholds ranging from 0.1 to 0.8 were found to work equally well, and gave 
essentially identical results. The corresponding speedup can be very large, as discussed 
in gni 

4 Reclassification of the A, B and I Local Minima 

When reanalysing stationary point databases obtained in previous work,^'^^ situations 
were still encountered where the KMC runs became stuck in extended regions consist- 
ing of local minima separated by small barriers. This observation suggests that minima 
in the B region (for A^B rates) are connected to relatively low-lying I minima by bar- 
riers that are smaller than those for return to a B minimum. Expanding the B region 
to include such minima has a negligible effect on the two-state rate constants. In the 
present work we have defined expansions of the B (and A) regions using the same su- 
perbasin analysis that is employed in constructing disco nnectivity graphs. ^°'^°''^^ The 
local minima are partitioned into disjoint sets (superbasins) , so that at least one path- 
way exists between any two minima of each set that does not exceed a threshold energy, 
Et^. In contrast, any path between minima in different sets must include a transition 
state that lies above Ett- For a given Eth, all local minima in the same superbasin as a 
B minimum were classified as B, and similarly for the A region. When the threshold. 
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Eth, is low enough, this procedure results in no reclassifications; there is also a maxi- 
mum threshold above which A and B minima would share the same superbasin. For 
the databases considered below, the threshold could be varied over quite a wide range 
without changing the rate constants by more than about a factor of two. However, the 
corresponding CPU time can vary by orders of magnitude, as described below. 

5 The Graph Transformation Method 

The main idea of the graph transformation approach is to progressively remove local 
minima from the I set whilst leaving the average properties of interest unchanged for the 
database that remains. The theory is an extension of the results used to perform jumps 
to second neighbours in previous KMC simulations,^ and the leapfrog moves considered 
in ^ which themselves share some common ground with Novotny's 'Monte Carlo with 
absorbing Markov chains' approach. ^'^ The present method extends the approach of 
Bortz, Kalos and Lebowitz^^ to exclude not only the transitions from the current state 
into itself but also transitions involving an adjacent minimum. For example, suppose 
we wish to remove minimum i £ I. Consider KMC trajectories that arrive at minimum 
(3, which is adjacent to i. We wish to step directly from /3 to any minimum in the set 
F that is adjacent to /5 or i, excluding these two minima themselves. To ensure that 
the mean first-passage times between the A and B sets are unchanged we must define 
new branching probabilities, P'a from /5 to all 7 G F, along with a new waiting time 
for escape from j3, tL Here, r^ corresponds to the mean waiting time for escape from 
j3 to any 7 G F, while the modified branching probabilities subsume all the possible 
recrossings involving minimum (3 that could occur before a transition to a minimum in 

F. Hence the new branching probabilities are: 

00 

p^^ = ip,,n^ + p,p) Y,{Pp:Pi^r = iP,iPiP + p,p)/{^ - PpiP^p)- (14) 

m=0 
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This formula also applies if either P^^p or P^i vanishes. When calculating mean first- 
passage times for transitions from the B to A regions we do not consider branching 
probabilities out of A minima, and similarly for branching probabilities out of B minima 
when considering A to B transitions. However, P'^ and ri for /3 G A are never used in 
calculating the A ^ B rates, and similarly for /3 G B and B ^ A rates. Hence we can 
use equation ()14|1 for all minima /3 G A U B U I, and obtain results for both A <— B and 
B ^ A rate constants at the end of the procedure. Detailed balance can then be used 
as a consistency check. 

It is easy to show that the new branching probabilities are normalised: 



^ 1 - PpiPip 1 - P/3iPi/3 

To calculate r^ we use the method of Sec. El 



-', 






_ Tp + Pj^Tj 

1 — P/3iPi/3 
C=0 



(16) 



The modified branching probabilities and waiting times could be used in a KMC sim- 
ulation based upon the stationary point database with minimum i excluded. Since the 
modified branching probabilities, P'^, subsume the sums over all paths from /5 to 7 
that involve i we would expect the probability that a trajectory starting at b G B and 
ending at a G A is conserved. Here we consider for specificity the /cab rate constant, 
corresponding to paths that start from B minima and terminate at an A minimum. 
In contrast to the formulation of /c^g and /cab^; above, revisits to B minima are now 
allowed, as for the conventional KMC calculations that yield /c™^ in equation (fTTHl . 
Since each trajectory exiting from 7 G F acquires a time increment equal to the average 
value, r^, the contributions to the mean first-passage time from individual A minima 
are not conserved (unless there is a single A minimum). Nevertheless, the overall mean 
first-passage time to A is conserved, i.e. %, = T^, where the prime denotes a trans- 
formed quantity. To prove these results consider the effect of removing minimum i 
on trajectories reaching minimum (3, which is initially connected to i, from minimum 
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b G B. The total probability of a pathway terminating at a if it starts from b is the 
sum of the product of branching probabilities, Wg = -Pa^n-P^^^^.i • • --P^Si-^Cib' o^^r all 
the corresponding paths ,^ G a ^ b: 

oo 

€ea<-b i^es' ^leis^b 7er m=o 52ea^7 

?36=' €iG/3<-b 76r 6ea^7 

(17) 
and similarly for any other minimum adjacent to i and any other pathway that revisits 
/3 other than by immediate recrossings of the type /3 — ^ i — *> /3. H is the ensemble of all 
paths for which probabilistic weights cannot be written in the form defined by the last 
term of the above equation. Here H' is the ensemble of all paths which probabilistic 
weights cannot be written in the form defined by the last term of the above equation. 
Hence the transformation preserves the individual probabilities Xlfga^-b^?- 

Now consider the effect of removing minimum i on the contribution to the mean 
first-passage time from b G B to A using the approach of Sec. 121 
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Cie/3^b 



6e/3<-b 7gr 



C=o 



dc 



En. E ^6 

(=0 7Gr 52ea^7 



E >^6E 

6e/3^b 7Gr 



HP' 

dC 



(18) 
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J (=0 6Ga^7 



+ E ^cEK, E 

flG/3<-b 7Gr ^2 6a^7 



dW, 



6 



dC 



C=o 



where the tildes indicate that every branching probability Pa/3 is replaced by P^/je^'^'^, 
as in ^ The first and last terms are unchanged from the original database in this 
construction, but the middle term. 



E '^..E 

eie/3^b 7er 



dC 



E ^^^ 

(=0 6Ga«-7 



P^\Pip{Tp + Ti) + P^pirp + P/SiPi/sn) 



(19) 
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is different (unless there is only one A minimum). However, if we sum over A minima 
then XlaeA Xlf2ea^7 ^6 = 1 ^^ ^11 7' ^^^ we can now simplify the sum over 7 as 

The same argument can be applied whenever a trajectory reaches a minimum adjacent 
to i, so that 7^' = %, as required. 

At low temperature some of the Pap approach unity, and numerical problems arise 
in calculating terms like 1 — PapPfia- However, precision can be regained by writing 

Pap = 1 — 2_^ P^P = 1 — eap and Ppa = 1 — 2_^ A-a = 1 ~ C/Ja, (21) 

77^a 77^/3 

and then using 1 - PapPpa = ea/3 - ^apepa + e/3a- 

Further applications of the graph transformation method will be described else- 
where in the context of Markov chains considered as digraphs. ^^ In particular, we note 
that more general formulations are possible so that waiting times involving revisits to 
arbitrary sets of minima can be included. The main advantage of this theory is that 
there is an upper bound to the maximum operation count involved in removing all 
the I minima. In contrast to the successive overrelaxation technique used to calculate 
committor probabilities, described in ^ the graph transformation does not depend 
upon satisfying any convergence criteria. Once all the I minima have been removed 
the only transformed branching probabilities remaining, P^^, are the relative probabil- 
ities of paths from A and B minima terminating at any of the other A and B minima. 
The final waiting times, t{^, correspond to the average time before a transition to any 
other A or B minimum starting from b, and similarly for the r^. Here we use a single 
prime to denote the final values of transformed quantities, although in general they 
may change many times as the I minima are successively removed. The associated rate 
constants can then be calculated as 



AB = ^^ 2^ —J- Z^ PLh and /cba = -^ 2^ -J- 2_^ Phs.- (22) 
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The transformed probabilities and waiting times in equation ()22|1 are different from 
the committor probabihties and waiting times that appear in the definitions of k^ 
and k^^, and from the waiting times that appear in equation p()|l for fc™'". It is 
also noteworthy that we have avoided the steady-state approximation for the I min- 
ima. Since the computational cost of the graph transformation does not change as the 
temperature decreases, in contrast to the successive overrelaxation calculations and 
conventional KMC runs, this approach becomes increasingly advantageous at low tem- 
perature. In fact it is also possible to define transformations that enable us to evaluate 
/c^l^ and /c^B^^ precisely in a fixed number of operations; these results will be presented 
else where. ^^ 

6 Results 

The formulations in equations (jHl), ©, and P^ were tested for stationary point 
databases obtained in previous DPS runs for permutational isomerization and mor- 
phological transitions of four different atomic clusters bound by the Lennard- Jones 
(LJ) potential. ^^ In each case Arrhenius fits of /cab and /cba were presented in previous 
work,^ and only for the cluster containing 55 atoms, LJ55, are the results significantly 
different here. For the three other examples (LJ13, LJ38 and LJ75) the effective barrier 
heights changed by less than 0.05 e, and the Arrhenius prefactors changed by a factor 
of three or less. Here e is the pair well depth for the LJ potential. The new results for 
LJ55 are therefore considered in more detail below. 

The global potential energy minimum for LJ55 is a Mackay icosahedron,^^ which 
exhibits special stability and 'magic number' properties. ^^'^^ There are four distinct 
sites for a tagged atom in the global minimum: one in the centre, one in the middle 
shell, and two in the outer shell. ^ DPS calculations in previous work considered the 
rate for migration of the tagged atom between the centre and either one of the sur- 
face sites. ^° Even at the solid-like/liquid-like melting transition temperature for this 
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cluster the potential energy barrier between these A and B states is around 50 ksT. 
Temperatures down to ksT/e = 0.04 were considered in the present work, where the 
barrier is 350 ksT. Standard KMC calculations for this system previously proved to 
be unfeasible below about ksT /e = 0.3.^ 

A disconnectivity graph that distinguishes permutation-inversion isomers of the 
tagged atom is shown in Figure [U Separate branches appear for the two distinct 
surface sites, but these minima were grouped together in state B for the rate con- 
stant calculations. The branch for group A, at the right of the graph, has the tagged 
atom at the centre of the icosahedron. The calculated two-state rate constants do not 
vary significantly over a wide range of E^^, and KMC averages for tb employing 1000 
trajectories are generally sufficient. For higher thresholds the A and B sets start to 
overlap, indicating that the highest energy transition state on the lowest energy path 
between the A and B regions lies approximately 14 e above the global minimum. Using 
a fractional convergence tolerance of 10"'^ for the total rate in the committor probabil- 
ity calculations and KMC averages over 1000 trajectories per B minimum, the entire 
analysis of the database requires only a few seconds of CPU time on an UltraSparcIII 
900 MHz processor. 

The temperature dependence of both k^^ and k^^, for the surface-to-centre and 
centre-to-surface rates, can be fitted quite accurately (coefficient of determination, 
i?2 = 0.99999) by the Arrhenius form k = a exp{-A/kBT) for 0.04 < keT/e < 0.3: 

kAB- a = 1.38 X lO^z/Lj, A = 14.05 e, 

(23) 

kBA- a = 6.52 X IOVlj, A = 14.04 e. 



where the unit of frequency is z/lj = ^JejMo^^ with M the atomic mass and 2^/^cr 
the equilibrium pair separation. Symmetry requires that 42/i;AB = ^ba when only 
permutation-inversion isomers of the global minimum including a tagged atom are 
included in the A and B sets.^ The above results deviate a little from this ideal 
ratio because there are 77 minima in the A set and 1763 minima in the B set using 
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reclassification at a threshold energy of Eth = — 270e. At ksT/e = 0.1 the centre-to- 
surface rate constant has order 10~^° s~^ for parameters appropriate to argon. 

The above rates are significantly faster than those obtained from a summation over 
DPS paths in previous work,^ indicating that the latter sums were not converged. In 
particular, the value of the effective barrier, A, in the Arrhenius fit is about 4e lower, 
and now coincides closely with the barrier corresponding to the highest transition state 
on the lowest energy path between the A and B regions. The present results therefore 
supersede the values obtained in reference [11]. 

A more detailed analysis of the results for LJ55 (Tables ^0)) reveals a number of 
trends that are likely to be useful in future work, especially regarding the choice of ii^th 
and two-state kinetics. At ksT/e = 0.4 we find that k^, k^^ and fc^g agree well for 
— 267e > -Eth > —270 e (TableU]). These values also agree with /c™^ calculated without 
any regrouping, although the latter simulation requires about a thousand times more 
CPU time. It is also interesting to note that the calculated values in Table Q] exhibit 
jumps when Eth changes from —270 e to —271 e, from —273 e to —274 e, and from —275 e 
to —276 e. Changes in behaviour occur at the same points for lower temperatures, 
as described below, and the cause is easily identified with the aid of Figure Q For 
Eth = —270 e the Mackay icosahedra corresponding to both minima in the original B 
set, and the C^y minimum with the tagged atom in the intermediate shell, all lie in 
the same superbasin. According to the reclassification scheme, all the local minima 
belonging to this superbasin are then classified as type B. However, at Eth = —271 e 
the branch corresponding to the tagged atom in the intermediate shell splits off from 
the branch containing the original B minima, so there is a qualitative change in the 
classification of local minima between these energies. Nevertheless, the overall A^-h-B 
kinetics are not disturbed dramatically, because equilibrium between the remaining B 
minima and the region that is now classed as 'intervening' is still established rapidly 
compared to the time scale for transitions between A and B. As Figure Q clearly shows. 
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the corresponding potential energy barrier for A<->B interconversion is about 5 e larger 
than for equilibration with this I region. It is also noteworthy that the relation 42/i;AB = 
ksA is obeyed to better than 0.2% for Eth = —TJX e, because the A and B sets are not 
expanded as much as for E'th = —270 e, where the Arrhenius fit was performed (above). 

On changing E^\^ from — 273e to — 274e the two icosahedra with the tagged atom 
in the surface separate into different superbasins. Some local minima corresponding 
to icosahedra with a surface vacancy and adatom pair also separate into their own 
branches, and are therefore assigned to the I set. This regrouping again affects fc^g, 
^AB^5 and A;2b ; but the difference is only a factor of about two. Nevertheless, the change 
in fc^l^ below E^^ = —270 e suggests that a two-state description is less appropriate for 
the classification of A, B and I sets corresponding to lower energy thresholds. 

When the temperature is lowered to ksT/e = 0.3 we again see changes in the 
calculated rate constants at the same i?th boundaries as for ksT/e = 0.4. However, 
there is one important change from the results at ksT/e = 0.4. A much tighter 
convergence condition is required in the calculation of committor probabilities for Eth < 
—271 e at the lower temperature, and the required CPU time jumps by a factor of about 
100. This effect can again be explained from the corresponding classification of the A, 
B and I sets with reference to the disconnectivity graph. The committor probability 
calculation requires many more iterations to converge for i^th "£ —271 e because there 
are I minima with very low values of Cf^. The relative CPU time required to calculate 
/c^B is now significantly smaller, since it is essentially unchanged from ksT/e = 0.4. 

For kBT/e = 0.2 the benefit of leapfrog KMC moves becomes visible (Table 01), 
producing a speedup factor of between ten and a few hundred. An even tighter con- 
dition is required to converge the committor probability calculation for Eth < —271 e, 
with a corresponding dramatic increase in the time required. At ksT/e = 0.1 (Table 
E} KMC runs without leapfrog moves were not feasible, and only k^ could be calcu- 
lated for Eth < —271 e. Note that the timings for /c^g are practically independent of 



temperature, which makes the graph transformation approach highly advantageous at 
low temperatures. 

7 Conclusions 

In this contribution we have shown how the original discrete path sampling (DPS) 
rate expressions, which involve the steady-state approximation for 'intervening' local 
minima, can be written in terms of committor probabilities. Analogous expressions can 
be derived without the steady-state approximation by simply changing the waiting time 
associated with escape from each B (for /cab) or A (for /cra) minimum. In the steady- 
state expressions the appropriate waiting times correspond to transitions to directly 
connected minima, involving a single transition state. In contrast, when the steady- 
state approximation is removed, the required waiting times correspond to transitions 
to any member of the A or B sets, and can be calculated from relatively short KMC 
runs. If the waiting times for escape from each intervening local minima to an adjacent 
minimum are small then the steady-state limit is recovered, as expected. 

Committor probabilities can be calculated using successive overrelaxation tech- 
niques, and this approach is significantly faster than the direct sum over paths of 
increasing length used in previous work to calculate kj^ and /cIa-^'^^ The KMC runs 
required to calculate waiting times for transitions to either the A or B region, and hence 
/c^B^ or /^BA^) are generally very short. In contrast, direct KMC simulations based on 
mean first-passage times involve trajectories that can only terminate in the product 
region. When the A and B states are separated by a high potential energy barrier the 
latter trajectories will generally revisit the reactant region many times. The alterna- 
tive formulations based upon committor probabilities may then be advantageous, but 
require leapfrog KMC moves and reclassification of the A, B and I sets to be feasible 
at low temperatures. 

The most powerful method that we have found for extracting phenomenological rate 
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constants is the graph transformation approach. Intervening local minima are succes- 
sively removed, and the branching probabilities and waiting times for the remaining 
local minima are renormalised to preserve the mean first-passage time between the A 
and B regions. At the end of the transformation only A and B minima remain, and 
the waiting time in each one is the average value for a transition to any of the other 
A and B minima. The advantage of this approach is that the time taken to perform 
the transformation does not depend upon temperature, so it can be used when the 
processes of interest are arbitrarily slow. 

The main conclusion that we draw from the detailed analysis of LJ55 in ^is that 
the threshold energy, En^, for reclassification of A and B minima should be chosen 
above the potential energy where any likely kinetic traps branch off. This choice of 
threshold should not affect the two-state dynamics of interest significantly, so long as 
the potential energy difference between the top of any possible traps and the highest 
transition state on the lowest A ^^ B path is large compared to ksT. In this case there 
will be a clear separation of time scales for relaxation between the A and B regions and 
within the regions themselves, so that a two-state description is still applicable. ^^'^° 
The disconnectivity graph approach^''' ^^ provides a helpful way to recognise such sit- 
uations. ^° Both the graph transformation theory and leapfrog KMC moves combined 
with an appropriate choice of -Eth can produce exponential speedups in the kinetic 
analysis, providing access to temperatures where the potential energy barrier is more 
than a hundred times larger than ksT. 

References 

[1] T. S. van Erp, D. Moroni and P. G. Bolhuis, J. Chem. Phys. 118, 7762 (2003). 
[2] C. Dellago, P. Bolhuis and P. L. Geissler, Advances Chem. Phys. 123, 1 (2002). 



20 



[3] P. G. Bolhuis, D. Chandler, C. Dellago and P. L. Geissler, Annu. Rev. Phys. 
Chem. 53, 291 (2002). 



[4] 
[5] 
[6] 
[7] 



[9 
[10 



[11 
[12 

[13; 

[14 

[is; 

[16; 

[17; 



N. Singhal, C. D. Snow and V. S. Pande, J. Chem. Phys. 121, 415 (2004). 

A. K. Faradjian and R. Elber, J. Chem. Phys. 120, 10880 (2004). 

S. Sriraman, I. G. Kevrekidis and G. Hummer, J. Phys. Chem. B 109, 6479 (2005). 

K. N. Kudin and R. Car, J. Chem. Phys. 122, 114108 (2005). 

A. Berezhkovskii and A. Szabo, J. Chem. Phys. 121, 9186 (2004). 

D. J. Wales, Mol. Phys. 100, 3285 (2002). 

D. J. Wales, Energy Landscapes: Applications to Clusters, Biomolecules and 
Glasses, Cambridge University Press, Cambridge (2003). 

D. J. Wales, Mol. Phys. 102, 883 (2004). 

F. H. Stillinger and T. A. Weber, Science 225, 983 (1984). 

R. E. Kunz and R. S. Berry, J. Chem. Phys. 103, 1904 (1995). 

J. N. Murrell and K. J. Laidler, Trans. Faraday Soc. 64, 371 (1968). 

N. G. van Kampen, Stochastic Processes in Physics and Chemistry, Elsevier, Am- 
sterdam (1981). 

R. E. Kunz, Dynamics of First- Order Phase Transitions, Deutsch, Thun (1995). 

D. J. Wales and J. P. K. Doye, J. Chem. Phys. 119, 12409 (2003). 



[18] W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical 
Recipes in Fortran: The Art of Scientific Computing, Cambridge University Press, 
Cambridge (1992). 

[19] A. B. Bortz, M. H. Kalos and J. L. Lebowitz, J. Comput. Phys. 17, 10 (1975). 

21 



[20] A. F. Voter, Phys. Rev. B 34, 6819 (1986). 

[21] K. A. Fichthorn and W. H. Weinberg, J. Chem. Phys 95, 1090 (1991). 

[22] D. A. Evans and D. J. Wales, J. Chem. Phys. 119, 9947 (2003). 

[23] W. Cai, M. H. Kalos, M. de Koning and V. V. Bulatov, Phys. Rev. E 66, 046703 
(2002). 

[24] M. de Koning, W. Cai, B. Sadigh, T. Oppelstrup, M. H. Kalos and V. V. Bulatov, 
J. Chem. Phys. 122, 074103 (2005). 

[25] R. Du, V. S. Pande, A. Y. Grosberg, T. Tanaka and E. I. Shakhnovich, J. Chem. 
Phys. 108, 334 (1998). 

[26] C. D. Snow, E. J. Sorin, Y. M. Rhee and V. S. Pande, Ann. Rev. Biophys. Biomol. 
Struct. 34, 43 (2004). 

[27] H. M. Taylor and S. Karlin, An introduction to stochastic modelling, Academic 
Press, Orlando (1984). 

[28] Z. Bai, J. Demmel, J. Dongarra, A. Ruhe and H. van der Vorst (eds.). Tem- 
plates for the Solution of Algebraic Eigenvalue Problems: a Practical Guide, SIAM, 
Philadelphia (2000). 

[29] K. J. Laidler, Chemical Kinetics, Harper & Row, New York (1987). 

[30] O. M. Becker and M. Karplus, J. Chem. Phys. 106, 1495 (1997). 

[31] D. J. Wales, M. A. Miller and T. R. Walsh, Nature 394, 758 (1998). 

[32] S. Goldberg, Probability: An Introduction, Dover Publications, New York (1960). 

[33] M. A. Novotny, Phys. Rev. Lett. 74, 1 (1995). 

[34] S. A. Trygubenko and D. J. Wales, in preparation (2005). 

22 



[35] J. E. Jones and A. E. Ingham, Proc. R. Soc. A 107, 636 (1925). 

[36] A. L. Mackay, Acta Cryst. 15, 916 (1962). 

[37] R. S. Berry, T. L. Beck, H. L. Davis and J. Jellinek, Adv. Chem. Phys. 70B, 75 

(1988). 

[38] P. Labastie and R. L. Whetten, Phys. Rev. Lett. 65, 1567 (1990). 

[39] D. Chandler, J. Chem. Phys. 68, 2959 (1978). 

[40] C. Dellago, P. G. Bolhuis and D. Chandler, J. Chem. Phys. 110, 6617 (1999). 



23 



Tables 

Table 1: Results for LJ55 at ksT/e = 0.4 for tb waiting times averaged over 1000 KMC 
trajectories and a fractional convergence criterion of 10"'^ in the committor probability cal- 
culation. CPU times (seconds) are given in brackets for an UltraSparclll 900 MHz processor. 
The two timings in the fc^g column refer to runs with/without leapfrog moves. For com- 
parison, k}^*^ = 0.48 X 10^^ averaged over 1000 trajectories, and this calculation required 
about 4800 s on an UltraSparclll 900 MHz processor both with and without leapfrog KMC 
moves. All rate constants are in reduced units of ^lj. 

Eth/6 4|X10^ klfxW klf/kf^ kfl^W 



-266 


1.02(1) 


0.50(8/10) 


0.50 


0.50(7) 


-267 


0.46(1) 


0.41 (7/9) 


0.89 


0.34(8) 


-268 


0.46(1) 


0.41 (6/6) 


0.90 


0.34(9) 


-269 


0.46(1) 


0.45 (5/5) 


0.96 


0.38(8) 


-270 


0.46(1) 


0.44 (4/5) 


0.95 


0.38(9) 


-271 


0.53(1) 


0.16(11/8) 


0.29 


0.14(22) 


-272 


0.54(3) 


0.15(6/5) 


0.28 


0.15(19) 


-273 


0.58(3) 


0.16(5/5) 


0.27 


0.15(22) 


-274 


1.20(3) 


0.23 (4/4) 


0.19 


0.24(25) 


-275 


1.20(3) 


0.23 (4/4) 


0.19 


0.24(25) 


< -276 


2.27(3) 


0.42 (4/3) 


0.18 


0.42 (24) 
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Table 2: Results for LJ55 at ksT/e = 0.3 for tb waiting times averaged over 1000 KMC 
trajectories, and fractional convergence criteria in the committor probability calculation of 
lO^"' for E'th ^ — 270e and 10~^ for £'th < —271 e. CPU times (seconds) are given in brackets 
for an UltraSparcIII 900 MHz processor. The two timings in the kp^ column refer to runs 
with/without leapfrog moves. For comparison, k^^'~^ = 0.13 x 10"^^ averaged over 1000 
trajectories, and this calculation required about 2.0 x 10^ s and 2.6 x 10^ s on an UltraSparcIII 
900 MHz processor for KMC runs with and without leapfrog KMC moves, respectively. All 
rate constants are in reduced units of z^lj- 



E,^/e 


kf^ X 10" 


klf X W^ 


lNSS/lSS 
'^AB I'^AB 


kfl X 1011 


-266 


0.19(2) 


0.16(3/4) 


0.85 


0.16(7) 


-267 


0.13(1) 


0.12(2/2) 


0.96 


0.10(8) 


-268 


0.13(1) 


0.12(2/2) 


0.96 


0.10(9) 


-269 


0.12(1) 


0.12(1/2) 


1.00 


0.11(8) 


-270 


0.12(1) 


0.12(1/2) 


0.99 


0.10(9) 


-271 


0.13(122) 


0.11(125/135) 


0.84 


0.038 (22) 


-272 


0.13(128) 


0.11(126/129) 


0.83 


0.038(19) 


-273 


0.13(128) 


0.12(129/129) 


0.87 


0.040 (22) 


-274 


0.16(146) 


0.10(147/147) 


0.61 


0.067(25) 


-275 


0.16(147) 


0.10(147/147) 


0.61 


0.067(25) 


< -276 


0.17(152) 


0.13(152/152) 


0.77 


0.10(24) 
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Table 3: Results for LJ55 at ksT/e = 0.2 for tb waiting times averaged over 1000 KMC 
trajectories, and fractional convergence criteria in the committor probability calculation of 
lO^-'^ for EtY, > -268 e, 10""^ for -270 e < Eti, < -269 e, and lO"'^ for Eth = -271 e. The kf^ 
result for ii^th = —271 e is probably not converged even for this tighter convergence criterion, 
and should be considered a lower bound for comparison with fe^g- CPU times (seconds) 
are given in brackets for an UltraSparcIII 900 MHz processor. The two timings in the A;^! 
column refer to runs with/without leapfrog moves. Conventional KMC calculations were not 
feasible at this temperature. All rate constants are in reduced units of i^lj. 

Eth/6 41 X 10^2 kffxlO^^ 4IV4I ^2^x10^2 



-266 


0.79(2) 


0.74(2/635) 


0.93 


0.75(9) 


-267 


0.65(1) 


0.65(2/36) 


1.00 


0.65(10) 


-268 


0.65(1) 


0.65(2/28) 


1.00 


0.65(10) 


-269 


0.65(1) 


0.65(2/219) 


1.00 


0.65(10) 


-270 


0.65(1) 


0.65 (2/206) 


1.00 


0.65(10) 


-271 


0.092(119331) 


— 




0.021 (26) 


-272 


— 


— 




0.022(23) 


-273 


— 


— 




0.023(27) 


-274 


— 


— 




0.044(31) 


-275 


— 






0.045(31) 


; -276 


— 


— 


— 


0.60 (29) 
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Table 4: Results for LJ55 at ksT/e = 0.1 for tb waiting times averaged over 1000 KMC 
trajectories, and a fractional convergence criterion in the committor probability calculation 
of 10"^ for E'th ^ — 270 e. For thresholds of £'th < — 271 e it proved impossible to converge 
the committor probability calculation for the fractional convergence criteria required to yield 
meaningful results. CPU times (seconds) are given in brackets for an UltraSparcIII 900 MHz 
processor. The timings in the kp^ column refer to runs with leapfrog moves; the correspond- 
ing runs without leapfrog moves were not feasible. Conventional KMC calculations were also 
unfeasible at this temperature. All rate constants are in reduced units of t'Lj. 



EthA 41 X 10^2 fcNss^io^^ klf/kf^ kflxlQ 
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-266 


0.14(2) 


0.12(3) 


0.87 


0.12(8) 


-267 


0.14(2) 


0.14(544) 


1.00 


0.14(9) 


-268 


0.14(2) 


0.14(3) 


1.00 


0.14(11) 


-269 


0.14(3) 


0.14(3) 


1.00 


0.14(10) 


-270 


0.14(3) 


0.14(20) 


1.00 


0.14(10) 


-271 


— 


— 


— 


0.012(26) 


-272 


— 


— 


— 


0.012(23) 


-273 


— 


— 


— 


0.013(25) 


-274 


— 




— 


0.075 (25) 


-275 


— 






0.075 (25) 


< -276 


— 


— 


— 


0.13(24) 
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-264 
-265 - 
-266 
-267 - 
-268 
-269 - 
-270 
-271 - 
-272 - 
-273 
-274 ^ 
-275 
-276 ^ 
-277 
-278 ^ 
-279 



VI, 




B Cs^ 



B C2v 



Figure 1: Disconnectivity graph for LJ55 in which permutation-inversion isomers of the 
shaded atom are distinguished. Permutation-inversion isomers of the other atoms are grouped 
together for every minimum and transition state. The tagged atom can occupy four distinct 
sites in the icosahedral global minimum, producing four separate branches, which are labelled 
according to the appropriate point group symmetry and A/B region (before any reclassifi- 
cation). The DPS stationary point database contains 5,609 minima and 10,134 transition 
states,^ but the graph includes only the lowest 250 minima for clarity. The potential energy, 
y, is in units of e. 
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